chinese corpus in a sentence (example sentences)
- A classified study of word-segmentation inconsistency in a Chinese corpus
A study on the classified handling of word-segmentation inconsistency in Chinese corpora.
- On the construction of a Chinese corpus based on semantic dependency relations
The construction of a Chinese corpus based on semantic dependency relations.
- Finally, the SVM+TBL method is applied to the CoNLL-2000 English corpus and to the Chinese corpus we defined
Finally, this paper applies the SVM+TBL method to the CoNLL-2000 English corpus and to the Chinese corpus we defined.
- Two kinds of knowledge sources are explored: one is expert knowledge and the other is a small dialectal Chinese corpus
There are two dialect-related knowledge sources: one is experts, and the other is a small-scale database of Mandarin spoken against a dialect background.
- Juang Der-ming, Hsieh Ching-chun, and Lin Hsih have pointed out the extent of the difficulties and how they need to be addressed, particularly in ancient Chinese corpora
Juang Der-ming, Hsieh Ching-chun, and Lin Hsih have pointed out the complexity and difficulty of the problem, especially for classical Chinese text corpora.
- The Institute of Computational Linguistics at Peking University has completed the basic processing of a contemporary Chinese corpus of 27 million Chinese characters
Abstract: The Institute of Computational Linguistics at Peking University has completed the basic processing of a contemporary Chinese corpus containing 27 million Chinese characters.
- Characters missing from the interchange code, particularly in processing classical Chinese corpora, are the major hurdle in any large-scale Chinese language processing
When processing Chinese-character material by computer, the glyphs of some characters are often absent from the interchange code; the situation is especially serious for ancient texts.
- Sun Maosong and Zuo Zhengping have presented a word-segmentation algorithm based on a large Chinese corpus; the approach may help in understanding unrestricted Chinese texts
They presented an ambiguity-resolving segmentation algorithm based on a large-scale corpus; the method helps in understanding unrestricted Chinese text.
- We present an extensive experimental evaluation of the refined concept index on two English collections and one Chinese corpus, using a state-of-the-art support vector machine classifier
Therefore, in large-scale text-classification applications, feature-selection algorithms are often preferred; concept indexing, however, is an exception.
- We have now written a good deal of software for Chinese corpus processing and achieved solid results, but its output still does not fully meet our needs and requires further improvement
Although the current processing of Chinese corpora has achieved certain results, national evaluations show that it still falls far short of practical needs and awaits further improvement.
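Several of the example sentences above concern automatic Chinese word segmentation. Purely as an illustration (the cited papers use far more sophisticated, corpus-trained methods), a minimal forward maximum-matching segmenter over a toy lexicon might look like this; the lexicon and sample sentence are invented:

```python
# Toy forward maximum-matching (FMM) word segmenter.
# Lexicon and sentence are illustrative assumptions only.
def fmm_segment(text, lexicon, max_word_len=4):
    """Greedily match the longest lexicon word at each position;
    fall back to a single character when nothing matches."""
    words = []
    i = 0
    while i < len(text):
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                words.append(candidate)
                i += length
                break
    return words

lexicon = {"中文", "语料库", "分词", "研究"}
print(fmm_segment("中文语料库分词研究", lexicon))
# ['中文', '语料库', '分词', '研究']
```

Greedy matching like this is exactly what produces the overlapping (crossing) segmentation ambiguities discussed in later examples, which is why statistical disambiguation over large corpora is needed.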
- Part-of-speech tagging is a fundamental topic in natural language processing; its accuracy matters for Chinese corpus annotation, machine translation, and information retrieval over large-scale text
Part-of-speech tagging is a fundamental topic in natural language processing; correct tagging is important for Chinese corpus annotation, machine translation, and information retrieval over large-scale text.
- Based on statistics of maximal crossing ambiguities from a large-scale Chinese corpus, I divide them into three sorts and adopt different methods to handle each; this modified algorithm greatly improves the handling of maximal crossing ambiguities
On the basis of statistics over maximal overlapping-ambiguity strings in large-scale real texts, this work divides them into three classes and handles each separately, greatly improving the ability to process such strings.
- The performance of existing word segmenters on this type of ambiguity is still unsatisfactory; in this paper, cross ambiguities in running Chinese text are described quantitatively and systematically, based on observations from a Chinese corpus of 100 million Chinese characters
Based on a large Chinese corpus of 100 million characters and a Chinese word list of 110,000 words, this paper makes an exhaustive survey of overlapping-ambiguity segmentation strings, with multi-angle, multi-level statistical classification.
- Corpus building is fundamental work in Chinese information processing. The processing of a Chinese corpus includes word segmentation and part-of-speech tagging; these are widely used in many applications (for example, automatic retrieval of Chinese text, machine translation, and Chinese character recognition) and provide important resources for such research
Automatic word segmentation and part-of-speech tagging play a key role in many practical applications (automatic retrieval, filtering, classification, and summarization of Chinese text; automatic proofreading; Chinese-foreign machine translation; post-processing for Chinese character recognition and Chinese speech recognition; Chinese speech synthesis; sentence-level keyboard input of Chinese characters; simplified-traditional conversion; and so on), and they provide important resources and strong support for much corpus-based research.
- The problem is critical: in the classical Chinese corpus developed by Academia Sinica over the past 14 years, there are more than 9,600 Chinese characters without appropriate codes. In this paper, we present a database of Chinese graphemes through which the structure of any missing character, as well as its attributes, can be represented
At present, the missing-character problem is a common nightmare for regions that inherit Chinese culture; wherever Chinese personal names, place names, or historical materials are involved, the problem is quite serious, so it has become a matter of international concern.
- Firstly, for character and word errors in a text, neighboring-character relations are used to check errors by character-string co-occurrence probability. Secondly, for syntactic errors, the predicate head word and other sentence constituents are recognized, based on statistical analysis of a large-scale contemporary Chinese corpus, to check the syntax. Thirdly, for semantic errors, a semantic dependency tree is built on HowNet knowledge, and a sentence-similarity method based on semantic dependency analysis is used to check the semantics
For checking character and word errors, this paper mainly uses binary character/word adjacency relations, checking errors by co-occurrence probability. For checking grammatical errors, it uses an existing large-scale corpus in our laboratory: statistical analysis of the corpus yields the linguistic regularities and knowledge needed, and errors in syntactic structure are checked by recognizing the predicate head word and other sentence constituents. For checking semantic errors, it mainly uses HowNet knowledge to build a semantic dependency tree and checks semantics by computing the similarity of a sentence's valid collocation pairs.
- Abstract: Based on the statistical characteristics of Chinese maximal noun phrases (MNPs) in a Chinese corpus of 5,573 sentences, this paper proposes two efficient algorithms for identifying Chinese MNPs: (1) identification using boundary distribution probabilities, and (2) identification using internal structure rules. Experimental results show that algorithm (2) performs better, with a precision of 85.4% and a recall of 82.3%
Abstract: Through statistical analysis of the distribution of maximal noun phrases in a corpus of 5,573 Chinese sentences, two effective algorithms for automatically identifying Chinese maximal noun phrases are proposed: one based on boundary distribution probabilities and one based on internal structural combination. Experimental results show that the latter achieves a precision of 85.4% and a recall of 82.3%, a good automatic identification result.
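The error-checking example above flags character pairs whose corpus co-occurrence probability is low. A minimal sketch of that bigram idea follows; the training corpus, threshold, and test sentences are invented for illustration and are not the cited system:

```python
from collections import Counter

# Toy character-bigram error checker in the spirit of the
# co-occurrence approach described above.
def train_bigrams(corpus):
    """Count adjacent character pairs across a training corpus."""
    counts = Counter()
    for sentence in corpus:
        for a, b in zip(sentence, sentence[1:]):
            counts[(a, b)] += 1
    return counts

def suspicious_pairs(sentence, counts, min_count=1):
    """Flag adjacent character pairs seen fewer than min_count times."""
    return [(a, b) for a, b in zip(sentence, sentence[1:])
            if counts[(a, b)] < min_count]

corpus = ["汉语语料库", "语料库加工", "汉语分词"]
counts = train_bigrams(corpus)
print(suspicious_pairs("汉语语料库", counts))  # [] — every pair attested
print(suspicious_pairs("汉吾语料库", counts))  # [('汉', '吾'), ('吾', '语')]
```

A real system would smooth the counts and use probabilities rather than raw thresholds, but the flagging logic is the same: unattested adjacencies are candidate typing errors.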